Smart Paper Notes

Expressive Reasoning Graph Store: A Unified Framework for Managing RDF and Property Graph Databases

Sumit Neelam , Udit Sharma , Sumit Bhatia , Hima Karanam , Ankita Likhyani , Ibrahim Abdelaziz , Achille Fokoue , L. V. Subramaniam

Category: Artificial Intelligence

2022-09-13

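Resource Description Framework (RDF) and Property Graph (PG) are the two most commonly used data models for representing, storing, and querying graph data. We present Expressive Reasoning Graph Store (ERGS), a graph store built on top of JanusGraph (a property graph store) that also allows storing and querying RDF datasets. First, we describe how RDF data can be converted into a property graph representation, and then describe a query translation module that converts SPARQL queries into a series of Gremlin traversals. The converters and translators thus developed allow any Apache TinkerPop compliant graph database to store and query RDF datasets. We demonstrate the effectiveness of the proposed approach using JanusGraph as the base property graph store and compare its performance with standard RDF systems.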
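The abstract does not spell out how ERGS maps RDF onto a property graph, so the following is only a minimal sketch under an assumed convention: triples with a literal object become vertex properties, and triples with an IRI object become labeled edges. The function names, the `is_literal` flag, and the sample data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def triples_to_property_graph(triples):
    """Convert (subject, predicate, object, is_literal) tuples to a toy property graph.

    Assumed convention (not from the paper): literal-valued triples become
    vertex properties, IRI-valued triples become labeled edges.
    """
    vertices = defaultdict(dict)  # vertex id -> {property name: [values]}
    edges = []                    # (source id, edge label, target id)
    for s, p, o, is_literal in triples:
        vertices[s]  # touching the key makes sure the subject exists as a vertex
        if is_literal:
            vertices[s].setdefault(p, []).append(o)
        else:
            vertices[o]  # the object is a resource, so it is a vertex too
            edges.append((s, p, o))
    return dict(vertices), edges


def out_neighbors(edges, source, label):
    """Toy analogue of a one-hop traversal: follow `label` edges out of `source`."""
    return [t for s, l, t in edges if s == source and l == label]


triples = [
    ("ex:alice", "foaf:name", "Alice", True),
    ("ex:alice", "foaf:knows", "ex:bob", False),
    ("ex:bob", "foaf:name", "Bob", True),
]
vertices, edges = triples_to_property_graph(triples)
friends = out_neighbors(edges, "ex:alice", "foaf:knows")
```

In this toy setting, a SPARQL triple pattern such as `ex:alice foaf:knows ?x` corresponds to the one-hop lookup in `out_neighbors`; a real translator would emit Gremlin traversal steps against the graph store instead.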

Learning to Transpile AMR into SPARQL

Mihaela Bornea , Ramon Fernandez Astudillo , Tahira Naseem , Nandana Mihindukulasooriya , Ibrahim Abdelaziz , Pavan Kapanipathi , Radu Florian , Salim Roukos

Category: Natural Language Processing

2021-12-15

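We propose a transition-based system to transpile Abstract Meaning Representation (AMR) into SPARQL for Knowledge Base Question Answering (KBQA). This allows part of the abstraction problem to be delegated to a strongly pre-trained semantic parser, while the transpilation itself is learned from a small amount of paired data. We start from recent work relating AMR and SPARQL constructs, but rather than applying a set of rules, we teach a BART model to use these relations selectively. Further, following recent semantic parsing work, we avoid explicitly encoding AMR and instead encode the parser state in BART's attention mechanism. The resulting model is simple, provides supporting text for its decisions, and outperforms recent progress in AMR-based KBQA on LC-QuAD (F1 53.4) while matching it on QALD (F1 30.8), exploiting the same inductive biases.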

A Two-Stage Approach towards Generalization in Knowledge Base Question Answering

Srinivas Ravishankar , June Thai , Ibrahim Abdelaziz , Nandana Mihindukulasooriya , Tahira Naseem , Pavan Kapanipathi , Gaetano Rossiello , Achille Fokoue

Category: Natural Language Processing | Artificial Intelligence

2021-11-10

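Most existing approaches to Knowledge Base Question Answering (KBQA) focus on a specific underlying knowledge base, either because of assumptions inherent in the approach or because evaluating it on a different knowledge base requires non-trivial changes. However, many popular knowledge bases share similarities in their underlying schemas that can be leveraged to facilitate generalization across knowledge bases. To achieve this generalization, we introduce a KBQA framework based on a two-stage architecture that explicitly separates semantic parsing from knowledge base interaction, facilitating transfer learning across datasets and knowledge graphs. We show that pretraining on datasets with a different underlying knowledge base can provide significant performance gains and reduce sample complexity. Our approach achieves comparable or state-of-the-art performance on LC-QuAD (DBpedia), WebQSP (Freebase), SimpleQuestions (Wikidata), and MetaQA (WikiMovies-KG).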

A Scalable AutoML Approach Based on Graph Neural Networks

Mossad Helali , Essam Mansour , Ibrahim Abdelaziz , Julian Dolby , Kavitha Srinivas

Category: Machine Learning

2021-10-29

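AutoML systems build machine learning models automatically by searching over valid data transformations and learners, along with hyperparameter optimization for each learner. Many AutoML systems use meta-learning to guide the search for optimal pipelines. In this work, we present a novel meta-learning system called KGPIP which (1) builds a database of datasets and corresponding pipelines by mining thousands of scripts with program analysis, (2) uses dataset embeddings to find similar datasets in the database based on their content rather than on metadata-based features, and (3) models AutoML pipeline creation as a graph generation problem, to succinctly characterize the diverse pipelines seen for a single dataset. KGPIP's meta-learning is a sub-component for AutoML systems. We demonstrate this by integrating KGPIP with two AutoML systems. Our comprehensive evaluation on 126 datasets, including those used by state-of-the-art systems, shows that KGPIP significantly outperforms these systems.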

A Comprehensive Review on Autonomous Navigation

Saeid Nahavandi , Roohallah Alizadehsani , Darius Nahavandi , Shady Mohamed , Navid Mohajer , Mohammad Rokonuzzaman , Ibrahim Hossain

Category: Robotics

2022-12-24

The field of autonomous mobile robots has undergone dramatic advancements over the past decades. Despite achieving important milestones, several challenges are yet to be addressed. Aggregating the achievements of the robotics community in survey papers is vital for keeping track of the current state of the art and the challenges that must be tackled in the future. This paper tries to provide a comprehensive review of autonomous mobile robots covering topics such as sensor types, mobile robot platforms, simulation tools, path planning and following, sensor fusion methods, obstacle avoidance, and SLAM. The motivation for presenting a survey paper is twofold. First, the field of autonomous navigation evolves quickly, so writing survey papers regularly is crucial to keep the research community well aware of its current status. Second, deep learning methods have revolutionized many fields, including autonomous navigation. Therefore, this paper also gives an appropriate treatment of the role of deep learning in autonomous navigation. Future work and research gaps are also discussed.

Anomaly Detection using Ensemble Classification and Evidence Theory

Fernando Arévalo , Tahasanul Ibrahim , Christian Alison M. Piolo , Andreas Schwung

Category: Machine Learning

2022-12-23

Multi-class ensemble classification remains a popular focus of investigation within the research community. The popularization of cloud services has sped up its adoption due to the ease of deploying large-scale machine-learning models. It has also drawn the attention of the industrial sector because of its ability to identify common problems in production. However, building an ensemble classifier poses challenges, namely the proper selection and effective training of the pool of classifiers, the definition of a proper architecture for multi-class classification, and uncertainty quantification of the ensemble classifier. The robustness and effectiveness of the ensemble classifier lie in the selection of the pool of classifiers, as well as in the learning process. Hence, the selection and the training procedure of the pool of classifiers play a crucial role. An (ensemble) classifier learns to detect the classes that were used during the supervised training. However, when fed data from unknown conditions, the trained classifier will still attempt to predict one of the classes learned during training. To this end, the uncertainty of the individual classifiers and of the ensemble classifier can be used to assess the learning capability. We present a novel approach for novelty detection using ensemble classification and evidence theory. A pool selection strategy is presented to build a solid ensemble classifier. We present an architecture for multi-class ensemble classification and an approach to quantify the uncertainty of the individual classifiers and the ensemble classifier. We use this uncertainty for anomaly detection. Finally, we use the Tennessee Eastman benchmark to perform experiments that test the ensemble classifier's prediction and anomaly detection capabilities.
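The abstract names evidence theory without giving details. As a rough illustration of how evidence from two classifiers can be fused, here is a minimal sketch of Dempster's rule of combination from Dempster-Shafer theory. Whether the paper uses exactly this rule is an assumption, and the class names and mass values are made up for the example.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of class labels to its mass.
    Returns the fused mass function and the conflict mass K.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    fused = {s: w / (1.0 - conflict) for s, w in combined.items()}
    return fused, conflict


# Hypothetical two-class frame of discernment and classifier outputs.
frame = frozenset({"normal", "fault"})
m_a = {frozenset({"normal"}): 0.7, frame: 0.3}
m_b = {frozenset({"normal"}): 0.6, frozenset({"fault"}): 0.3, frame: 0.1}
fused, conflict = dempster_combine(m_a, m_b)
```

The mass left on the whole frame and the conflict mass are two quantities that could serve as uncertainty signals; a large value of either suggests the input may not belong to any class seen during training.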

Adapting to Latent Subgroup Shifts via Concepts and Proxies

Ibrahim Alabdulmohsin , Nicole Chiou , Alexander D'Amour , Arthur Gretton , Sanmi Koyejo , Matt J. Kusner , Stephen R. Pfohl , Olawale Salaudeen , Jessica Schrouff , Katherine Tsai

Category: Machine Learning (Statistics) | Artificial Intelligence | Machine Learning

2022-12-21

We address the problem of unsupervised domain adaptation when the source domain differs from the target domain because of a shift in the distribution of a latent subgroup. When this subgroup confounds all observed data, neither covariate shift nor label shift assumptions apply. We show that the optimal target predictor can be non-parametrically identified with the help of concept and proxy variables available only in the source domain, and unlabeled data from the target. The identification results are constructive, immediately suggesting an algorithm for estimating the optimal predictor in the target. For continuous observations, when this algorithm becomes impractical, we propose a latent variable model specific to the data generation process at hand. We show how the approach degrades as the size of the shift changes, and verify that it outperforms both covariate and label shift adjustment.

High-resolution canopy height map in the Landes forest (France) based on GEDI, Sentinel-1, and Sentinel-2 data with a deep learning approach

Martin Schwartz , Philippe Ciais , Catherine Ottlé , Aurelien De Truchis , Cedric Vega , Ibrahim Fayad , Martin Brandt , Rasmus Fensholt , Nicolas Baghdadi , François Morneau

Category: Computer Vision

2022-12-20

In intensively managed forests in Europe, where forests are divided into stands of small size and may show heterogeneity within stands, a high spatial resolution (10 - 20 meters) is arguably needed to capture the differences in canopy height. In this work, we developed a deep learning model based on multi-stream remote sensing measurements to create a high-resolution canopy height map over the "Landes de Gascogne" forest in France, a large maritime pine plantation of 13,000 km$^2$ with flat terrain and intensive management. This area is characterized by even-aged and mono-specific stands, of a typical length of a few hundred meters, harvested every 35 to 50 years. Our deep learning U-Net model uses multi-band images from Sentinel-1 and Sentinel-2 with composite time averages as input to predict tree height derived from GEDI waveforms. The evaluation is performed with external validation data from forest inventory plots and a stereo 3D reconstruction model based on Skysat imagery available at specific locations. We trained seven different U-net models based on a combination of Sentinel-1 and Sentinel-2 bands to evaluate the importance of each instrument in the dominant height retrieval. The model outputs allow us to generate a 10 m resolution canopy height map of the whole "Landes de Gascogne" forest area for 2020 with a mean absolute error of 2.02 m on the Test dataset. The best predictions were obtained using all available satellite layers from Sentinel-1 and Sentinel-2 but using only one satellite source also provided good predictions. For all validation datasets in coniferous forests, our model showed better metrics than previous canopy height models available in the same region.

Anticancer Peptides Classification using Kernel Sparse Representation Classifier

Ehtisham Fazal , Muhammad Sohail Ibrahim , Seongyong Park , Imran Naseem , Abdul Wahab

Category: Machine Learning

2022-12-19

Cancer is one of the most challenging diseases because of its complexity, variability, and diversity of causes. It has been one of the major research topics over the past decades, yet it is still poorly understood. To this end, multifaceted therapeutic frameworks are indispensable. Anticancer peptides (ACPs) are the most promising treatment option, but their large-scale identification and synthesis require reliable prediction methods, which is still a problem. In this paper, we present an intuitive classification strategy that differs from the traditional black box method and is based on the well-known statistical theory of sparse-representation classification (SRC). Specifically, we create over-complete dictionary matrices by embedding the composition of the K-spaced amino acid pairs (CKSAAP). Unlike the traditional SRC frameworks, we use an efficient matching pursuit solver instead of the computationally expensive basis pursuit solver in this strategy. Furthermore, the kernel principal component analysis (KPCA) is employed to cope with non-linearity and dimension reduction of the feature space whereas the synthetic minority oversampling technique (SMOTE) is used to balance the dictionary. The proposed method is evaluated on two benchmark datasets for well-known statistical parameters and is found to outperform the existing methods. The results show the highest sensitivity with the most balanced accuracy, which might be beneficial in understanding structural and chemical aspects and developing new ACPs. The Google-Colab implementation of the proposed method is available at the author's GitHub page (https://github.com/ehtisham-Fazal/ACP-Kernel-SRC).
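To make the SRC idea concrete, the sketch below codes a test vector over a small dictionary with plain greedy matching pursuit and assigns the class whose atoms reconstruct it with the smallest residual. It is a toy illustration under stated assumptions, not the authors' implementation: it omits the CKSAAP features, KPCA, and SMOTE steps, and the two-dimensional atoms, labels, and function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]


def matching_pursuit(atoms, x, n_iter=10):
    """Greedy matching pursuit over unit-norm atoms; returns one coefficient per atom."""
    coeffs = [0.0] * len(atoms)
    residual = list(x)
    for _ in range(n_iter):
        scores = [dot(a, residual) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        coeffs[k] += scores[k]
        # Remove the selected atom's contribution from the residual.
        residual = [r - scores[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs


def src_classify(atoms, labels, x, n_iter=10):
    """Pick the class whose atoms alone reconstruct x with the smallest residual."""
    coeffs = matching_pursuit(atoms, x, n_iter)
    best_label, best_err = None, float("inf")
    for label in sorted(set(labels)):
        recon = [0.0] * len(x)
        for c, a, l in zip(coeffs, atoms, labels):
            if l == label:
                recon = [r + c * ai for r, ai in zip(recon, a)]
        err = math.sqrt(sum((xi - ri) ** 2 for xi, ri in zip(x, recon)))
        if err < best_err:
            best_label, best_err = label, err
    return best_label


# Hypothetical dictionary: two atoms per class, each scaled to unit norm.
atoms = [normalize(v) for v in ([1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9])]
labels = ["acp", "acp", "non-acp", "non-acp"]
predicted = src_classify(atoms, labels, [0.95, 0.15])
```

Matching pursuit only needs inner products and a greedy selection per iteration, which is why it is cheaper than solving the convex basis pursuit problem.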

An Extension of Fisher's Criterion: Theoretical Results with a Neural Network Realization

Ibrahim Alsolami , Tomoki Fukai

Category: Machine Learning | Computer Vision

2022-12-19

Fisher's criterion is a widely used tool in machine learning for feature selection. For large search spaces, Fisher's criterion can provide a scalable solution to select features. A challenging limitation of Fisher's criterion, however, is that it performs poorly when mean values of class-conditional distributions are close to each other. Motivated by this challenge, we propose an extension of Fisher's criterion to overcome this limitation. The proposed extension utilizes the available heteroscedasticity of class-conditional distributions to distinguish one class from another. Additionally, we describe how our theoretical results can be cast into a neural network framework, and conduct a proof-of-concept experiment to demonstrate the viability of our approach to solve classification problems.
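The limitation described above can be seen with the textbook two-class Fisher score of a single feature, the squared gap between class means divided by the sum of class variances. Which exact variant the authors start from is an assumption here, the sample values are made up, and the sketch does not implement the proposed extension.

```python
from statistics import mean, pvariance


def fisher_score(class_a, class_b):
    """Classical two-class Fisher score of one feature: (mean gap)^2 / (sum of variances)."""
    gap = mean(class_a) - mean(class_b)
    return gap ** 2 / (pvariance(class_a) + pvariance(class_b))


# Feature 1: the class means differ, so the score is large.
separated = fisher_score([1.0, 2.0, 3.0], [7.0, 8.0, 9.0])

# Feature 2: identical means but very different spreads (heteroscedastic classes).
same_mean = fisher_score([-1.0, 0.0, 1.0], [-10.0, 0.0, 10.0])
```

The second feature scores zero even though its spread clearly separates the classes, which is the kind of information a variance-aware extension could exploit.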
